Mirror of https://github.com/Hopiu/linkchecker.git, synced 2026-03-27 03:00:36 +00:00
proxy config
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@186 e7d03fd6-7b0d-0410-9947-9c21f3af8025
This commit is contained in:
parent
b6de623f4e
commit
b6b2fdcc3c
23 changed files with 105 additions and 378 deletions
@@ -5,6 +5,6 @@ build-stamp
*.o
build
dist
linkchecker-*.tar.gz
MANIFEST
VERSION
LinkCheckerConf.py

5 INSTALL
@@ -5,12 +5,9 @@ Requirements
------------
Python >= 2.0 from http://www.python.org/

Setup
-----
Run "python setup.py config" to configure.
Run "python setup.py install --prefix=<prefix>" to install.
Default prefix is /usr.
Run "python setup.py install" to install.
Run "python setup.py --help" for help.
Debian users can build the .deb package with "debian/rules binary" as
root or "fakeroot debian/rules binary" as a normal user.

21 Makefile
@@ -1,16 +1,12 @@
# This Makefile is only used by developers.
# You will need a Debian Linux system to use this Makefile!
VERSION=$(shell python setup.py --version)
PACKAGE = linkchecker
NAME = $(shell python setup.py --name)
PACKAGE=linkchecker
NAME=$(shell python setup.py --name)
HOST=treasure.calvinsplayground.de
PROXY=--proxy= -itreasure.calvinsplayground.de -s
#PROXY=-P$(HOST):8080
#HOST=fsinfo.cs.uni-sb.de
#PROXY=-Pwww-proxy.uni-sb.de:3128
LCOPTS=-ocolored -Ftext -Fhtml -Fgml -Fsql -Fcsv -R -t0 -v
DEBPACKAGE = $(PACKAGE)_$(VERSION)_i386.deb
SOURCES = \
LCOPTS=-ocolored -Ftext -Fhtml -Fgml -Fsql -Fcsv -R -t0 -v -itreasure.calvinsplayground.de -s
DEBPACKAGE=$(PACKAGE)_$(VERSION)_i386.deb
SOURCES=\
linkcheck/Config.py \
linkcheck/FileUrlData.py \
linkcheck/FtpUrlData.py \
@@ -22,7 +18,6 @@ linkcheck/JavascriptUrlData.py \
linkcheck/Logging.py \
linkcheck/MailtoUrlData.py \
linkcheck/NntpUrlData.py \
linkcheck/RobotsTxt.py \
linkcheck/TelnetUrlData.py \
linkcheck/Threader.py \
linkcheck/UrlData.py \
@@ -45,9 +40,7 @@ distclean: clean

dist: mo
	rm -rf debian/tmp
	python setup.py sdist --formats=gztar,zip bdist_rpm
	# extra run without SSL compilation
	python setup.py bdist_wininst
	python setup.py sdist --formats=gztar,zip bdist_rpm bdist_wininst
	fakeroot debian/rules binary
	mv -f ../$(DEBPACKAGE) dist
@@ -55,7 +48,7 @@ package:
	cd dist && dpkg-scanpackages . ../override.txt | gzip --best > Packages.gz

files:
	./$(PACKAGE) $(LCOPTS) $(PROXY) -i$(HOST) http://$(HOST)/~calvin/
	./$(PACKAGE) $(LCOPTS) -i$(HOST) http://$(HOST)/~calvin/

VERSION:
	echo $(VERSION) > VERSION

13 README
@@ -11,12 +11,12 @@ o output can be colored or normal text, HTML, SQL, CSV or a GML sitemap
  graph
o HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Gopher, Telnet and local
  file links are supported.
  Javascript links are currently ignored
o restrict link checking with regular expression filters for URLs
o HTTP proxy support
o proxy support
o give username/password for HTTP and FTP authorization
o robots.txt exclusion protocol support
o internationalization support
o i18n support
o command line interface
o (Fast)CGI web interface

@@ -31,8 +31,7 @@ LinkChecker is licensed under the GNU Public License.
Credits go to Guido van Rossum for making Python. His hovercraft is
full of eels!
As this program is directly derived from my Java link checker, additional
credits go to Robert Forsman (the author of JCheckLinks) and his
robots.txt parse algorithm.
credits go to Robert Forsman (the author of JCheckLinks).
I want to thank everybody who gave me feedback, bug reports and
suggestions.

@@ -49,14 +48,10 @@ So for example 1.1.5 is the fifth release of the 1.1 development package.

Included packages
-----------------
httplib from http://www.lyra.org/greg/python/
httpslib from http://home.att.net/~nvsoft1/ssl_wrapper.html
DNS see DNS/README
fcgi.py and sz_fcgi.py from http://saarland.sz-sb.de/~ajung/sz_fcgi/
fintl.py from http://sourceforge.net/snippet/detail.php?type=snippet&id=100059

Note that the following packages are modified by me:
httplib.py (renamed to http11lib.py and a bug fixed)
fcgi.py (implemented streamed output)
sz_fcgi.py (simplified the code)
DNS/Lib.py:566 fixed rdlength name error

4 debian/changelog vendored
@@ -4,8 +4,10 @@ linkchecker (1.3.0) unstable; urgency=low
    and use the one provided within the Python library
  * added <script src=> urls for link testing. Thanks to Tomas Cox
    <cox@idecnet.com> for the suggestion
  * we get now all proxy configuration values from $http_proxy,
    $https_proxy on Unix,Windows and from Internet Config on the Mac

 -- Bastian Kleineidam <calvin@users.sourceforge.net>  Wed, 1 Nov 2000 22:07:25 +0100
 -- Bastian Kleineidam <calvin@users.sourceforge.net>  Thu, 2 Nov 2000 11:17:16 +0100

linkchecker (1.2.6) unstable; urgency=low

208 fintl.py
@@ -1,208 +0,0 @@
## vim:ts=4:et:nowrap
"""i18n (multiple language) support. Reads .mo files from GNU gettext msgfmt

If you want to prepare your Python programs for i18n you could simply
add the following lines to the top of a BASIC_MAIN module of your py-program:
    try:
        import fintl
        gettext = fintl.gettext
        fintl.bindtextdomain(YOUR_PROGRAM, YOUR_LOCALEDIR)
        fintl.textdomain(YOUR_PROGRAM)
    except ImportError:
        def gettext(msg):
            return msg
    _ = gettext
and/or also add the following to the top of any module containing messages:
    import BASIC_MAIN
    _ = BASIC_MAIN.gettext

Now you could use _("....") everywhere instead of "...." for message texts.

Once you have written your internationalized program, you can use
the suite of utility programs contained in the GNU gettext package to aid
the translation into other languages.

You ARE NOT REQUIRED to release the sourcecode of your program, since
linking of your program against GPL code is avoided by this module.
Although it is possible to use the GNU gettext library by using the
*intl.so* module written by Martin von Löwis if this is available. But it is
not required to use it in the first place.
"""
# Copyright 1999 by <mailto: pf@artcom-gmbh.de> (Peter Funk)
#
# All Rights Reserved
#
# Permission to use, copy, modify, and distribute this software and its
# documentation for any purpose and without fee is hereby granted,
# provided that the above copyright notice appear in all copies.

# ArtCom GmbH AND Peter Funk DISCLAIMS ALL WARRANTIES WITH REGARD TO
# THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
# AND FITNESS, IN NO EVENT SHALL ArtCom GmBH or Peter Funk BE LIABLE
# FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
# WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN
# AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING
# OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

_default_localedir = '/usr/share/locale'
_default_domain = 'python'

# check out, if Martin v. Löwis 'intl' module interface to the GNU gettext
# library is available and use it only, if it is available:
try:
    from intl import *
except ImportError:
    # now do what the gettext library provides in pure Python:
    error = 'fintl.error'
    # some globals preserving state:
    _languages = []
    _default_mo = None  # This is default message outfile used by 'gettext'
    _loaded_mos = {}    # This is a dictionary of loaded message output files

    # some small little helper routines:
    def _check_env():
        """examine language enviroment variables and return list of languages"""
        # TODO: This should somehow try to find out locale information on
        # Non-unix platforms like WinXX and MacOS. Suggestions welcome!
        languages = []
        import os, string
        for envvar in ('LANGUAGE', 'LC_ALL', 'LC_MESSAGES', 'LANG'):
            if os.environ.has_key(envvar):
                languages = string.split(os.environ[envvar], ':')
                break
        # use locale 'C' as default fallback:
        if 'C' not in _languages:
            languages.append('C')
        return languages

    # Utility function used to decode binary .mo file header and seek tables:
    def _decode_Word(bin):
        # This assumes little endian (intel, vax) byte order.
        return ord(bin[0]) + (ord(bin[1]) << 8) + \
               (ord(bin[2]) << 16) + (ord(bin[3]) << 24)

    # Now the methods designed to be used from outside:

    def gettext(message):
        """return localized version of a 'message' string"""
        if _default_mo is None:
            textdomain()
        return _default_mo.gettext(message)

    _ = gettext

    def dgettext(domain, message):
        """like gettext but looks up 'message' in a special 'domain'"""
        # This may useful for larger software systems
        if not _loaded_mos.has_key(domain):
            raise error, "No '" + domain + "' message domain"
        return _loaded_mos[domain].gettext(message)

    class _MoDict:
        """read a .mo file into a python dictionary"""
        __MO_MAGIC = 0x950412de  # Magic number of .mo files
        def __init__(self, domain=_default_domain, localedir=_default_localedir):
            global _languages
            self.catalog = {}
            self.domain = domain
            self.localedir = localedir
            # delayed access to environment variables:
            if not _languages:
                _languages = _check_env()
            for self.lang in _languages:
                if self.lang == 'C':
                    return
                mo_filename = "%s//%s/LC_MESSAGES/%s.mo" % (
                    localedir, self.lang, domain)
                try:
                    buffer = open(mo_filename, "rb").read()
                    break
                except IOError:
                    pass
            else:
                return  # assume C locale
            # Decode the header of the .mo file (5 little endian 32 bit words):
            if _decode_Word(buffer[:4]) != self.__MO_MAGIC:
                raise error, '%s seems not be a valid .mo file' % mo_filename
            self.mo_version = _decode_Word(buffer[4:8])
            num_messages = _decode_Word(buffer[8:12])
            master_index = _decode_Word(buffer[12:16])
            transl_index = _decode_Word(buffer[16:20])
            buf_len = len(buffer)
            # now put all messages from the .mo file buffer in the catalog dict:
            for i in xrange(0, num_messages):
                start_master = _decode_Word(buffer[master_index+4:master_index+8])
                end_master = start_master + \
                             _decode_Word(buffer[master_index:master_index+4])
                start_transl = _decode_Word(buffer[transl_index+4:transl_index+8])
                end_transl = start_transl + \
                             _decode_Word(buffer[transl_index:transl_index+4])
                if end_master <= buf_len and end_transl <= buf_len:
                    self.catalog[buffer[start_master:end_master]] = \
                        buffer[start_transl:end_transl]
                else:
                    raise error, ".mo file '%s' is corrupt" % mo_filename
                # advance to the next entry in seek tables:
                master_index = master_index + 8
                transl_index = transl_index + 8

        def gettext(self, message):
            """return the translation of a given message"""
            try:
                return self.catalog[message]
            except KeyError:
                return message
        # _MoDict instances may be also accessed using mo[msg] or mo(msg):
        __getitem = gettext
        __call__ = gettext

    def textdomain(domain=_default_domain):
        """Sets the 'domain' to be used by this program. Defaults to 'python'"""
        global _default_mo
        if not _loaded_mos.has_key(domain):
            _loaded_mos[domain] = _MoDict(domain)
        _default_mo = _loaded_mos[domain]

    def bindtextdomain(domain, localedir=_default_localedir):
        global _default_mo
        if not _loaded_mos.has_key(domain):
            _loaded_mos[domain] = _MoDict(domain, localedir)
        if _default_mo is not None:
            _default_mo = _loaded_mos[domain]

    def translator(domain=_default_domain, localedir=_default_localedir):
        """returns a gettext compatible function object

        which is bound to the domain given as parameter"""
        pass  # TODO implement this

    def _testdriver(argv):
        message = ""
        domain = _default_domain
        localedir = _default_localedir
        if len(argv) > 1:
            message = argv[1]
        if len(argv) > 2:
            domain = argv[2]
        if len(argv) > 3:
            localedir = argv[3]
        # now perform some testing of this module:
        bindtextdomain(domain, localedir)
        textdomain(domain)
        info = gettext('')  # this is where special info is often stored
        if info:
            print ".mo file for domain %s in %s contains:" % (domain, localedir)
            print info
        else:
            print ".mo file contains no info"
        if message:
            print "Translation of '" + message + "' is '" + _(message) + "'"
        else:
            for msg in ("Cancel", "No", "OK", "Quit", "Yes"):
                print "Translation of '" + msg + "' is '" + _(msg) + "'"

if __name__ == '__main__':
    import sys
    if len(sys.argv) > 1 and (sys.argv[1] == "-h" or sys.argv[1] == "-?"):
        print "Usage :", sys.argv[0], "[ MESSAGE [ DOMAIN [ LOCALEDIR ]]]"
    _testdriver(sys.argv)

@@ -21,9 +21,11 @@ This module stores
 * Other configuration options
"""

import ConfigParser,sys,os,re,UserDict,string,time,Logging,LinkCheckerConf
import ConfigParser,sys,os,re,UserDict,string,time
import Logging,LinkCheckerConf
from os.path import expanduser,normpath,normcase,join,isfile
from types import StringType
from urllib import getproxies
from linkcheck import _

Version = LinkCheckerConf.version

@@ -92,8 +94,7 @@ class Configuration(UserDict.UserDict):
        self.data["authentication"] = [(re.compile(r'^.+'),
                                        'anonymous',
                                        'joe@')]
        self.data["proxy"] = 0
        self.data["proxyport"] = 8080
        self.data["proxy"] = getproxies()
        self.data["recursionlevel"] = 1
        self.data["robotstxt"] = 0
        self.data["strict"] = 0

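The hunk above replaces the fixed proxy/proxyport defaults with the mapping returned by urllib's getproxies(), one entry per scheme taken from the environment or platform settings. A minimal sketch of that mapping, using modern Python where the function lives in urllib.request (proxy URLs here are illustrative):

```python
import os
from urllib.request import getproxies_environment

# getproxies() also consults platform settings; getproxies_environment()
# reads only $http_proxy/$https_proxy, which makes the behaviour easy to
# demonstrate: one dict entry per scheme, keyed without the "_proxy" suffix.
os.environ["http_proxy"] = "http://proxy.example.com:3128"
os.environ["https_proxy"] = "http://proxy.example.com:3129"

proxies = getproxies_environment()
print(proxies["http"])   # http://proxy.example.com:3128
print(proxies["https"])  # http://proxy.example.com:3129
```

Storing the whole dict lets each URL class look up a proxy by its own scheme, which is exactly how the HttpUrlData change later in this commit uses it.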
@@ -376,11 +377,8 @@
        used in the linkchecker module.
        """
        debug("DEBUG: reading configuration from %s\n" % files)
        try:
            cfgparser = ConfigParser.ConfigParser()
            cfgparser.read(files)
        except ConfigParser.Error:
            return
        cfgparser = ConfigParser.ConfigParser()
        cfgparser.read(files)

        section="output"
        try:

@@ -389,16 +387,16 @@ class Configuration(UserDict.UserDict):
                self.data['log'] = self.newLogger(log)
            else:
                self.warn(_("invalid log option '%s'") % log)
        except ConfigParser.Error: pass
        except ConfigParser.NoOptionError: pass
        try:
            if cfgparser.getboolean(section, "verbose"):
                self.data["verbose"] = 1
                self.data["warnings"] = 1
        except ConfigParser.Error: pass
        except ConfigParser.NoOptionError: pass
        try: self.data["quiet"] = cfgparser.getboolean(section, "quiet")
        except ConfigParser.Error: pass
        except ConfigParser.NoOptionError: pass
        try: self.data["warnings"] = cfgparser.getboolean(section, "warnings")
        except ConfigParser.Error: pass
        except ConfigParser.NoOptionError: pass
        try:
            filelist = string.split(cfgparser.get(section, "fileoutput"))
            for arg in filelist:

@@ -406,12 +404,12 @@ class Configuration(UserDict.UserDict):
                if Loggers.has_key(arg) and arg != "blacklist":
                    self.data['fileoutput'].append(
                        self.newLogger(arg, {'fileoutput':1}))
        except ConfigParser.Error: pass
        except ConfigParser.NoOptionError: pass
        for key in Loggers.keys():
            if cfgparser.has_section(key):
                for opt in cfgparser.options(key):
                    try: self.data[key][opt] = cfgparser.get(key, opt)
                    except ConfigParser.Error: pass
                    except ConfigParser.NoOptionError: pass

        section="checking"
        try:

@@ -420,32 +418,28 @@ class Configuration(UserDict.UserDict):
                self.disableThreads()
            else:
                self.enableThreads(num)
        except ConfigParser.Error: pass
        except ConfigParser.NoOptionError: pass
        try: self.data["anchors"] = cfgparser.getboolean(section, "anchors")
        except ConfigParser.Error: pass
        try:
            self.data["proxy"] = cfgparser.get(section, "proxy")
            self.data["proxyport"] = cfgparser.getint(section, "proxyport")
        except ConfigParser.Error: pass
        except ConfigParser.NoOptionError: pass
        try:
            num = cfgparser.getint(section, "recursionlevel")
            if num<0:
                self.error(_("illegal recursionlevel number %d") % num)
            self.data["recursionlevel"] = num
        except ConfigParser.Error: pass
        except ConfigParser.NoOptionError: pass
        try:
            self.data["robotstxt"] = cfgparser.getboolean(section,
                "robotstxt")
        except ConfigParser.Error: pass
        except ConfigParser.NoOptionError: pass
        try: self.data["strict"] = cfgparser.getboolean(section, "strict")
        except ConfigParser.Error: pass
        except ConfigParser.NoOptionError: pass
        try:
            self.data["warningregex"] = re.compile(cfgparser.get(section,
                "warningregex"))
        except ConfigParser.Error: pass
        except ConfigParser.NoOptionError: pass
        try:
            self.data["nntpserver"] = cfgparser.get(section, "nntpserver")
        except ConfigParser.Error: pass
        except ConfigParser.NoOptionError: pass

        section = "authentication"
        try:

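The repeated change in this hunk narrows `except ConfigParser.Error` to `except ConfigParser.NoOptionError`: a merely missing option keeps its default, while genuinely malformed configuration is no longer silenced. A small sketch of the pattern with the modern configparser module (section and option names are illustrative):

```python
import configparser

cfg = configparser.ConfigParser()
cfg.read_string("[checking]\nrecursionlevel = 3\n")

# defaults, as the Configuration class sets them up before reading files
data = {"recursionlevel": 1, "robotstxt": 0}

# Catching only NoOptionError means absent options keep their defaults,
# while other errors (bad section, broken interpolation) still propagate.
for key, getter in (("recursionlevel", cfg.getint), ("robotstxt", cfg.getboolean)):
    try:
        data[key] = getter("checking", key)
    except configparser.NoOptionError:
        pass

print(data)  # {'recursionlevel': 3, 'robotstxt': 0}
```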
@@ -456,7 +450,7 @@ class Configuration(UserDict.UserDict):
                tuple[0] = re.compile(tuple[0])
                self.data["authentication"].append(tuple)
                i = i + 1
        except ConfigParser.Error: pass
        except ConfigParser.NoOptionError: pass

        section = "filtering"
        try:

@@ -467,9 +461,9 @@ class Configuration(UserDict.UserDict):
                self.data["externlinks"].append((re.compile(tuple[0]),
                    int(tuple[1])))
                i = i + 1
        except ConfigParser.Error: pass
        except ConfigParser.NoOptionError: pass
        try: self.data["internlinks"].append(re.compile(cfgparser.get(section, "internlinks")))
        except ConfigParser.Error: pass
        except ConfigParser.NoOptionError: pass
        try: self.data["allowdeny"] = cfgparser.getboolean(section, "allowdeny")
        except ConfigParser.Error: pass
        except ConfigParser.NoOptionError: pass

@@ -64,6 +64,5 @@ class FileUrlData(UrlData):
        return self.valid and html_re.search(self.url)

    def __str__(self):
        return "File link\n"+UrlData.__str__(self)

    def get_scheme(self):
        return "file"

@@ -27,9 +27,7 @@ ExcList.extend([
    ])

class FtpUrlData(UrlData):
    """
    Url link with ftp scheme.
    """
    """Url link with ftp scheme."""

    def checkConnection(self, config):
        _user, _password = self._getUserPassword(config)

@@ -47,7 +45,5 @@ class FtpUrlData(UrlData):
        except: pass
        self.urlConnection = None

    def __str__(self):
        return "FTP link\n"+UrlData.__str__(self)

    def get_scheme(self):
        return "ftp"

@@ -21,5 +21,5 @@ from linkcheck import _
class GopherUrlData(UrlData):
    "Url link with gopher scheme"

    def __str__(self):
        return "Gopher link\n"+UrlData.__str__(self)
    def get_scheme(self):
        return "gopher"

@@ -37,12 +37,8 @@ class HostCheckingUrlData(UrlData):
        self.urlTuple=None

    def getCacheKey(self):
        return self.host
        return self.get_scheme()+":"+self.host

    def checkConnection(self, config):
        ip = socket.gethostbyname(self.host)
        self.setValid(self.host+"("+ip+") "+_("found"))

    def __str__(self):
        return "host="+`self.host`+"\n"+UrlData.__str__(self)

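Prefixing the cache key with the scheme, as the getCacheKey change above does, keeps checks of different protocols against the same host from colliding in the result cache. A toy sketch of the difference (the cache dict and values here are made up for illustration):

```python
# Before this commit the key was just the host, so a mailto: check and a
# telnet: check of the same machine shared one cache slot and one result
# could mask the other. Scheme-qualified keys keep them distinct.
cache = {}
host = "treasure.calvinsplayground.de"

for scheme, result in (("mailto", "ok"), ("telnet", "refused")):
    cache[scheme + ":" + host] = result  # new-style key: scheme-qualified

print(len(cache))  # 2 distinct entries; a host-only key would leave just 1
```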
@@ -16,14 +16,18 @@
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
"""
import httplib,urlparse,sys,time,re,robotparser
from UrlData import UrlData
import Config,StringUtil
from UrlData import UrlData
from urllib import splittype, splithost
from linkcheck import _

class HttpUrlData(UrlData):
    "Url link with http scheme"
    netscape_re = re.compile("Netscape-Enterprise/")

    def get_scheme(self):
        return "http"

    def checkConnection(self, config):
        """
        Check a URL with HTTP protocol.

@@ -63,11 +67,12 @@ class HttpUrlData(UrlData):
           | "503" ; Service Unavailable
           | extension-code
        """
        self.proxy = config['proxy'].get(self.get_scheme(), None)
        if self.proxy:
            self.proxy = splittype(self.proxy)[1]
            self.proxy = splithost(self.proxy)[0]
        self.mime = None
        self.auth = None
        self.proxy = config["proxy"]
        self.proxyport = config["proxyport"]
        if not self.urlTuple[2]:
            self.setWarning(_("Missing '/' at end of URL"))
        if config["robotstxt"] and not self.robotsTxtAllowsUrl(config):

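The new code above peels a proxy URL down to its "host:port" part with urllib's splittype() and splithost(). Those helpers are long deprecated; urlsplit's netloc yields the same piece today. A minimal sketch (the proxy URL is an illustrative value):

```python
from urllib.parse import urlsplit

# splittype() strips the "http:" scheme, splithost() then strips the path,
# leaving "host:port" -- which is exactly urlsplit(...).netloc.
proxy_url = "http://proxy.example.com:3128/"
netloc = urlsplit(proxy_url).netloc
print(netloc)  # proxy.example.com:3128
```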
@@ -150,9 +155,8 @@ class HttpUrlData(UrlData):
    def _getHttpRequest(self, method="HEAD"):
        "Put request and return (status code, status text, mime object)"
        if self.proxy:
            Config.debug("DEBUG: using proxy %s:%d\n" %
                (self.proxy,self.proxyport))
            host = '%s:%d' % (self.proxy, self.proxyport)
            Config.debug("DEBUG: using proxy %s\n" % self.proxy)
            host = self.proxy
        else:
            host = self.urlTuple[1]
        if self.urlConnection:

@@ -204,9 +208,6 @@
        return config.robotsTxtCache_get(self.url)

    def __str__(self):
        return "HTTP link\n"+UrlData.__str__(self)

    def closeConnection(self):
        if self.mime:
            try: self.mime.close()

@@ -24,15 +24,6 @@ _supportHttps=hasattr(socket, 'ssl')
class HttpsUrlData(HttpUrlData):
    """Url link with https scheme"""

    def __init__(self,
                 urlName,
                 recursionLevel,
                 parentName = None,
                 baseRef = None,
                 line = 0):
        HttpUrlData.__init__(self, urlName, recursionLevel,
                             parentName, baseRef, line)

    def _getHTTPObject(self, host):
        return httplib.HTTPS(host)

@@ -43,5 +34,5 @@ class HttpsUrlData(HttpUrlData):
        self.setWarning(_("HTTPS not supported"))
        self.logMe(config)

    def __str__(self):
        return "HTTPS link\n"+UrlData.__str__(self)
    def get_scheme(self):
        return "https"

@@ -25,5 +25,5 @@ class JavascriptUrlData(UrlData):
        self.setWarning(_("Javascript url ignored"))
        self.logMe(config)

    def __str__(self):
        return "Javascript link\n"+UrlData.__str__(self)
    def get_scheme(self):
        return "javascript"

@@ -32,7 +32,10 @@ if os.name=='posix':

class MailtoUrlData(HostCheckingUrlData):
    "Url link with mailto scheme"

    def get_scheme(self):
        return "mailto"

    def buildUrl(self):
        HostCheckingUrlData.buildUrl(self)
        self.headers = {}

@@ -116,10 +119,4 @@ class MailtoUrlData(HostCheckingUrlData):

    def getCacheKey(self):
        return "mailto:"+str(self.adresses)

    def __str__(self):
        return "Mailto link\n"+HostCheckingUrlData.__str__(self)

        return self.get_scheme()+":"+str(self.adresses)

@@ -28,7 +28,10 @@ ExcList.extend([nntplib.error_reply,

class NntpUrlData(UrlData):
    "Url link with NNTP scheme"

    def get_scheme(self):
        return "nntp"

    def buildUrl(self):
        # use nntp instead of news to comply with the unofficial internet
        # draft of Alfred Gilman which unifies (s)news and nntp URLs

@@ -87,8 +90,3 @@ class NntpUrlData(UrlData):

    def getCacheKey(self):
        return self.url

    def __str__(self):
        return "NNTP link\n"+self.urlName

@@ -31,17 +31,10 @@ class TelnetUrlData(HostCheckingUrlData):
            raise linkcheck.error, _("Illegal telnet link syntax")
        self.host = string.lower(self.urlName[7:])

    def get_scheme(self):
        return "telnet"

    def checkConnection(self, config):
        HostCheckingUrlData.checkConnection(self, config)
        self.urlConnection = telnetlib.Telnet()
        self.urlConnection.open(self.host, 23)

    def getCacheKey(self):
        return "telnet:"+HostCheckingUrlData.getCacheKey(self)

    def __str__(self):
        return "Telnet link\n"+HostCheckingUrlData.__str__(self)

@@ -91,8 +91,7 @@ class UrlData:
        self.extern = 1
        self.data = None
        self.html_comments = []

    def setError(self, s):
        self.valid=0
        self.errorString = _("Error")+": "+s

@@ -347,12 +346,20 @@ class UrlData:
        return urls

    def get_scheme(self):
        return "no"

    def __str__(self):
        return "urlname="+`self.urlName`+"\nparentName="+`self.parentName`+\
            "\nbaseRef="+`self.baseRef`+"\ncached="+`self.cached`+\
            "\nrecursionLevel="+`self.recursionLevel`+\
            "\nurlConnection="+str(self.urlConnection)+\
            "\nline="+`self.line`
        return """%s link
urlname=%s
parentName=%s
baseRef=%s
cached=%s
recursionLevel=%s
urlConnection=%s
line=%s""" % \
            (self.get_scheme(), self.urlName, self.parentName, self.baseRef,
             self.cached, self.recursionLevel, self.urlConnection, self.line)

    def _getUserPassword(self, config):

@@ -26,12 +26,12 @@ class error(Exception):
# i18n suppport
import LinkCheckerConf
try:
    import fintl,os
    gettext = fintl.gettext
    import os
    from gettext import gettext, bindtextdomain, textdomain
    domain = 'linkcheck'
    localedir = os.path.join(LinkCheckerConf.install_data, 'locale')
    fintl.bindtextdomain(domain, localedir)
    fintl.textdomain(domain)
    bindtextdomain(domain, localedir)
    textdomain(domain)
except ImportError:
    def gettext(msg):
        return msg

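This hunk drops the bundled fintl.py in favour of the gettext module from the Python standard library; the bind/textdomain/fallback pattern it replaces still works the same way today. A minimal sketch (the domain name matches the diff, the locale directory is an illustrative placeholder):

```python
import gettext
import os

# Bind a message catalog directory; when no .mo file exists there for the
# current locale, gettext() simply returns the message unchanged, which is
# the same graceful fallback the old fintl-based code relied on.
localedir = os.path.join(os.getcwd(), "locale")  # assumed location
gettext.bindtextdomain("linkcheck", localedir)
gettext.textdomain("linkcheck")
_ = gettext.gettext

print(_("invalid log option '%s'") % "foo")  # untranslated fallback
```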
27 linkchecker
@@ -2,8 +2,8 @@

# imports and checks
import sys
if sys.version[:5] < "1.5.2":
    raise SystemExit, "This program requires Python 1.5.2 or later."
if (not hasattr(sys, 'version_info')) or sys.version_info < (2,0,0,'final',0):
    raise SystemExit, "This program requires Python 2.0 or later."
import getopt,re,string,os,urlparse
# 90 seconds timeout for all connections
#import timeoutsocket

@@ -48,9 +48,6 @@ Usage = _("USAGE\tlinkchecker [options] file_or_url...\n"
"-p pwd, --password=pwd\n"
" Try given password for HTML and FTP authorization.\n"
" Default is 'guest@'. See -u.\n"
"-P host[:port], --proxy=host[:port]\n"
" Use specified proxy for HTTP requests.\n"
" Standard port is 8080. Default is to use no proxy.\n"
"-q, --quiet\n"
" Quiet operation. This is only useful with -F.\n"
"-r depth, --recursion-level=depth\n"

@@ -95,6 +92,8 @@ Notes = _("NOTES\n"
"o If your platform does not support threading, LinkChecker uses -t0\n"
"o You can supply multiple user/password pairs in a configuration file\n"
"o Cookies are not accepted by LinkChecker\n"
"o To use proxies set $http_proxy, $https_proxy on Unix or Windows.\n"
" On a Mac use the Internet Config.\n"
"o When checking 'news:' links the given NNTP host doesn't need to be the\n"
" same as the host of the user browsing your pages!\n")

@@ -130,7 +129,7 @@ def printUsage(msg):
try:
    # Note: cut out the name of the script
    options, args = getopt.getopt(sys.argv[1:],
        "aDe:f:F:hi:lN:P:o:p:qr:Rst:u:VvwW:", # short options
        "aDe:f:F:hi:lN:o:p:qr:Rst:u:VvwW:", # short options
        ["anchors", # long options
        "config=",
        "debug",

@@ -141,7 +140,6 @@ try:
        "intern=",
        "allowdeny",
        "output=",
        "proxy=",
        "password=",
        "quiet",
        "recursion-level=",

@@ -169,13 +167,6 @@ for opt,arg in options:
        config.disableThreading()
config.read(configfiles)

# if no proxy is given, fall back to http_proxy environment variable
if os.environ.has_key('http_proxy') and not config['proxy']:
    config['proxy'] = urlparse.urlparse(os.environ["http_proxy"])[1]
    if string.find(config['proxy'], ':') != -1:
        config['proxy'],port = string.split(config['proxy'], ':')
        config['proxyport'] = int(port)

# apply options and arguments
_user = "anonymous"
_password = "guest@"

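The removed fallback above split $http_proxy into host and port by hand with string.split(':'); the urlsplit result exposes both directly, which is why getproxies() makes the manual parsing unnecessary. A small sketch using one of the proxy hosts mentioned elsewhere in this commit:

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www-proxy.uni-sb.de:3128")
# hostname and port replace the manual string.split(':') dance
print(parts.hostname)  # www-proxy.uni-sb.de
print(parts.port)      # 3128
```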
@@ -214,14 +205,6 @@ for opt,arg in options:
    elif opt=="-N" or opt=="--nntp-server":
        config["nntpserver"] = arg

    elif opt=="-P" or opt=="--proxy":
        proxy = re.compile("(.+):(.+)").match(arg)
        if proxy:
            config["proxy"] = proxy.group(1)
            config["proxyport"] = int(proxy.group(2))
        else:
            config["proxy"] = arg

    elif opt=="-p" or opt=="--password":
        _password = arg
        constructauth = 1

@@ -79,9 +79,6 @@
# overall strict checking. You can specify for each extern URL
# separately if its strict or not. See the [filtering] section
#strict=0
# proxy parameters
#proxy = www-proxy.uni-sb.de
#proxyport = 3128
# supply a regular expression for which warnings are printed if found
# in any HTML files.
#warningregex="Request failed"

@@ -116,4 +113,4 @@
# At the moment, authentication is used/needed for http[s] and ftp links.
[authentication]
#entry1=^http://treasure\.calvinsplayground\.de/~calvin/isnichmehr/ lebowski lebowski
#entry2=^ftp://void.cs.uni-sb.de calvin hutzli
#entry2=^ftp://void.cs.uni-sb.de calvin schnuckl

@@ -176,6 +176,8 @@ msgid ""
"o If your platform does not support threading, LinkChecker uses -t0\n"
"o You can supply multiple user/password pairs in a configuration file\n"
"o Cookies are not accepted by LinkChecker\n"
"o To use proxies set $http_proxy, $https_proxy on Unix or Windows.\n"
" On a Mac use the Internet Config.\n"
"o When checking 'news:' links the given NNTP host doesn't need to be the\n"
" same as the host of the user browsing your pages!\n"
msgstr ""

@@ -194,6 +196,8 @@ msgstr ""
"o Sie können mehrere user/password Paare in einer Konfigurationsdatei\n"
" angeben\n"
"o Cookies werden von LinkChecker nicht akzeptiert\n"
"o Um Proxies zu benutzen, setzen Sie $http_proxy, $https_proxy unter\n"
" Unix oder Windows. Auf einem Mac benutzen Sie die Internet Config.\n"
"o Beim Prüfen von 'news:' Links muß der angegebene NNTP Rechner nicht\n"
" unbedingt derselbe wie der des Benutzers sein!\n"

@@ -274,9 +278,6 @@ msgid ""
"-p pwd, --password=pwd\n"
" Try given password for HTML and FTP authorization.\n"
" Default is 'guest@'. See -u.\n"
"-P host[:port], --proxy=host[:port]\n"
" Use specified proxy for HTTP requests.\n"
" Standard port is 8080. Default is to use no proxy.\n"
"-q, --quiet\n"
" Quiet operation. This is only useful with -F.\n"
"-r depth, --recursion-level=depth\n"

@@ -347,9 +348,6 @@ msgstr ""
"-p pwd, --password=pwd\n"
" Verwende das angegebene Passwort für HTML und FTP Authorisation.\n"
" Standard ist 'guest@'. Siehe -u.\n"
"-P host[:port], --proxy=host[:port]\n"
" Verwende den angegebenen Proxy für HTTP Anfragen.\n"
" Standard Port ist 8080. Standard ist keine Verwendung eines Proxy.\n"
"-q, --quiet\n"
" Keine Ausgabe. Dies ist nur in Verbindung mit -F nützlich.\n"
"-r depth, --recursion-level=depth\n"

@@ -164,6 +164,8 @@ msgid ""
"o If your platform does not support threading, LinkChecker uses -t0\n"
"o You can supply multiple user/password pairs in a configuration file\n"
"o Cookies are not accepted by LinkChecker\n"
"o To use proxies set $http_proxy, $https_proxy on Unix or Windows.\n"
" On a Mac use the Internet Config.\n"
"o When checking 'news:' links the given NNTP host doesn't need to be the\n"
" same as the host of the user browsing your pages!\n"
msgstr ""

@@ -182,6 +184,8 @@ msgstr ""
"o Vous pouvez fournir plusieurs couples 'utilisateurs'/'mots de passe' dans "
"le fichier de configuration\n"
"o Les cookies ne sont pas acceptés par LinkChecker\n"
"o To use proxies set $http_proxy, $https_proxy on Unix or Windows.\n"
" On a Mac use the Internet Config.\n"
"o Lors d'un contrôle des liens 'news:', l'hôte NNTP spécifié n'a pas besoin "
"d'être le\n"
" même que l'hôte de l'utilisateur qui parcourt vos pages!\n"

@@ -260,9 +264,6 @@ msgid ""
"-p pwd, --password=pwd\n"
" Try given password for HTML and FTP authorization.\n"
" Default is 'guest@'. See -u.\n"
"-P host[:port], --proxy=host[:port]\n"
" Use specified proxy for HTTP requests.\n"
" Standard port is 8080. Default is to use no proxy.\n"
"-q, --quiet\n"
" Quiet operation. This is only useful with -F.\n"
"-r depth, --recursion-level=depth\n"

@@ -336,9 +337,6 @@ msgstr ""
"-p pwd, --password=pwd\n"
" Try given password for HTML and FTP authorization.\n"
" Default is 'guest@'. See -u.\n"
"-P host[:port], --proxy=host[:port]\n"
" Use specified proxy for HTTP requests.\n"
" Standard port is 8080. Default is to use no proxy.\n"
"-q, --quiet\n"
" Quiet operation. This is only useful with -F.\n"
"-r depth, --recursion-level=depth\n"