Connection
- the development version of geoLyzard is hosted on gecko.wu.ac.at
- to connect to the service, create an SSH tunnel through xmdimrill.wu.ac.at that forwards local port 8080 to port 8180 on gecko.wu.ac.at:
ssh user@xmdimrill.wu.ac.at -L 8080:gecko.wu.ac.at:8180
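Before calling the service it can help to confirm that the tunnel is actually listening. The following is a minimal sketch in Python, assuming the local forward on port 8080 from the command above. The helper name `tunnel_is_open` is hypothetical and not part of geoLyzard.

```python
import socket


def tunnel_is_open(host='localhost', port=8080, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper: it only checks that something accepts
    connections on the forwarded port, not that geoLyzard answers.
    """
    try:
        sock = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # connection refused, unreachable host, or timeout
        return False
    sock.close()
    return True
```

Calling `tunnel_is_open()` with the defaults checks `localhost:8080`. A `False` result usually means the `ssh` command is not running or uses a different local port.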
Usage
- the service can be used with any XML-RPC client, for example from Python:
import base64
from xmlrpclib import ServerProxy, Error

FILE = 'example.xml'
GEO_TAGGER_URL = 'http://localhost:8080/geotagger/geotagger'

# read the XML document and connect to the service through the tunnel
xml_cont = open(FILE, 'r').read().strip()
xmlrpc_server = ServerProxy(GEO_TAGGER_URL)

# map document keys to base64-encoded XML content
enc_dict = {}
for i in xrange(1, 2):
    enc_dict[str(i)] = base64.b64encode(xml_cont)

# getGeoLocation takes a gazetteer name and the dictionary of encoded XML files
# valid gazetteers: C, C5000, C10000, C50000, C100000, C500000
annotations = xmlrpc_server.Tagger.getGeoLocation("C1000000", enc_dict)
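The request dictionary above maps string keys to base64-encoded XML. The sketch below builds such a dictionary from several files. The helper name `encode_documents` and the 1-based string keys are assumptions carried over from the example, not confirmed requirements of the service. The `.decode('ascii')` step is only there so that the values are plain strings under Python 3 as well.

```python
import base64


def encode_documents(paths):
    """Map '1', '2', ... to the base64-encoded content of each file.

    Hypothetical helper; the key scheme mirrors the example above
    and is not confirmed as a requirement of the service.
    """
    enc_dict = {}
    for i, path in enumerate(paths, 1):
        with open(path, 'rb') as handle:
            content = handle.read().strip()
        # b64encode returns bytes; decode so the value is a text string
        enc_dict[str(i)] = base64.b64encode(content).decode('ascii')
    return enc_dict
```

The returned dictionary can be passed as the second argument of `getGeoLocation`.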
- two disambiguation algorithms are available: Default and Amitay
- setting the disambiguation algorithm with Python:
from xmlrpclib import ServerProxy, Error
GEO_TAGGER_URL = 'http://localhost:8080/geotagger/geotagger'
xmlrpc_server = ServerProxy(GEO_TAGGER_URL)
xmlrpc_server.Tagger.setDisambiguationAlgorithm('Amitay')
# or
xmlrpc_server.Tagger.setDisambiguationAlgorithm('Default')
- depending on the algorithm, several parameters are available:
  - Amitay:
    - minFocusPoints - Minimum number of focus points; results below this value are removed (default: 0.0)
    - nonAmbigiousConfidence - Confidence for non-ambiguous locations (default: 0.5)
    - maxPopulationConfidence - Maximum population confidence (default: 0.5)
    - minPopulationConfidence - Minimum population confidence (default: 0.25)
    - discountFactor - Discount factor for calculating the focus points of parents (default: 0.7)
  - Default:
    - narrowLocationBias (default: 3.3)
# example: setting a parameter
xmlrpc_server.Tagger.setDisambiguationParameters('minFocusPoints', '100.0')
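Selecting an algorithm and setting several of its parameters can be wrapped in one call. This is a minimal sketch, assuming the algorithm should be selected before its parameters are set, which the page does not state explicitly. The helper name `configure_tagger` is hypothetical, and values are converted to strings only because the example above passes '100.0' as a string.

```python
def configure_tagger(server, algorithm, parameters=None):
    """Select an algorithm, then set each parameter on the given proxy.

    Hypothetical helper; `server` is expected to be a ServerProxy
    like xmlrpc_server in the examples above.
    """
    server.Tagger.setDisambiguationAlgorithm(algorithm)
    # sorted() only makes the call order deterministic
    for name, value in sorted((parameters or {}).items()):
        server.Tagger.setDisambiguationParameters(name, str(value))


# usage, assuming xmlrpc_server is the ServerProxy created earlier:
# configure_tagger(xmlrpc_server, 'Amitay',
#                  {'minFocusPoints': 100.0, 'discountFactor': 0.7})
```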