Discussion:
Metadata retrieval from SciELO
Leonardo F. Fontenelle
2008-08-22 02:53:42 UTC
I would like Referencer to be able to retrieve metadata from SciELO
[http://www.scielo.org], which publishes the full content of many high
quality scientific Latin American journals.

I know nothing about the technical details, but I guess the fact that
SciELO supports OAI-PMH [http://www.scielo.br/oai] means it is possible
to write a Referencer plugin.
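
For illustration only, here is a minimal sketch of what an OAI-PMH lookup
could look like. The verb and parameter names (GetRecord, identifier,
metadataPrefix, oai_dc) come from the OAI-PMH protocol itself; the record
identifier and the sample response below are made up, and the base URL is
simply the one quoted above, so treat them as assumptions rather than
SciELO's documented behaviour.

```python
from urllib.parse import urlencode
from xml.dom import minidom

# Base URL as quoted in the post; whether it is the exact endpoint is an assumption.
OAI_BASE_URL = "http://www.scielo.br/oai"


def build_get_record_url(identifier, metadata_prefix="oai_dc"):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return OAI_BASE_URL + "?" + query


def parse_dublin_core(xml_text):
    """Collect the text of dc:title and dc:creator elements from a response."""
    dom = minidom.parseString(xml_text)

    def texts(tag):
        return [node.firstChild.data
                for node in dom.getElementsByTagName(tag)
                if node.firstChild is not None]

    return {"title": texts("dc:title"), "creator": texts("dc:creator")}


# Hypothetical, heavily trimmed response used only to exercise the parser.
SAMPLE_RESPONSE = (
    '<record xmlns:dc="http://purl.org/dc/elements/1.1/">'
    '<dc:title>Sample title</dc:title>'
    '<dc:creator>Silva, A.</dc:creator>'
    '<dc:creator>Souza, B.</dc:creator>'
    '</record>'
)

url = build_get_record_url("oai:example:0001")  # hypothetical identifier
record = parse_dublin_core(SAMPLE_RESPONSE)
```

A plugin would fetch the URL, hand the response to the parser, and copy
the resulting values into the document's fields.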

Thanks in advance!
--
Leonardo Fontenelle
http://leonardof.org
Leonardo F. Fontenelle
2008-12-01 01:50:53 UTC
Post by Leonardo F. Fontenelle
I would like Referencer to be able to retrieve metadata from SciELO
[http://www.scielo.org], which publishes the full content of many high
quality scientific Latin American journals.
[...]
Lucky me, I do pine for the days when men were men and wrote their own
plugins. This is my first attempt to provide a Referencer plugin to
retrieve metadata from SciELO.

Even though I was not able to "import referencer" in the Python console, I
tried most of the plugin code before I launched Referencer with it. To
my surprise, Referencer gave me this error message:


setting pythonPath to :./plugins:/home/leonardof/.referencer/plugins:/usr/lib/referencer:
void PluginManager::scan(const std::string&):
found module scielo
void PythonPlugin::printException():
Exceção: <type 'exceptions.SyntaxError'>

Módulo: scielo
Explicação: ('invalid syntax',
('/home/leonardof/.referencer/plugins/scielo.py', 123, 62, ' for node in metadata_xml.getElementsByTagName("pub-date"):\n'))

virtual void PythonPlugin::load(const std::string&):
Plugin::load: Couldn't import module

The code snippet is:

def resolve_metadata(doc, method):
    [...]
    xmldoc = minidom.ParseString(xml)
    metadata_xml = xmldoc.getElementsByTagName('front')[0]
    [...]
    for node in metadata_xml.getElementsByTagName("pub-date"):   <<<
        if node.getAttribute('pub-type') != "pub" : continue
        year = node.getElementsByTagName('year')[0].firstChild.data

    if year:
        doc.set_field('year', year)
    else:
        print "scielo.resolve_metadata:", "publication has no printed publication year."


I can't understand what's wrong here; the python console (2.6) doesn't
complain, and there's a similar "for" loop before this one.
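
Separately from the syntax error, the year lookup in the snippet can be
tried on its own, outside Referencer. The following is a rough sketch in
Python 3, not the plugin's actual code: the sample XML is invented for the
example (only the element and attribute names are taken from the snippet),
and the function name is hypothetical. It also returns None instead of
relying on a variable that is never assigned when no matching pub-date
exists.

```python
from xml.dom import minidom

# Invented sample; only the tag and attribute names come from the snippet.
SAMPLE_XML = (
    '<article><front><article-meta>'
    '<pub-date pub-type="epub"><year>2007</year></pub-date>'
    '<pub-date pub-type="pub"><year>2008</year></pub-date>'
    '</article-meta></front></article>'
)


def extract_publication_year(xml_text, pub_type="pub"):
    """Return the year of the first matching pub-date, or None if absent."""
    xmldoc = minidom.parseString(xml_text)
    fronts = xmldoc.getElementsByTagName("front")
    if not fronts:
        return None
    for node in fronts[0].getElementsByTagName("pub-date"):
        if node.getAttribute("pub-type") != pub_type:
            continue
        years = node.getElementsByTagName("year")
        if years and years[0].firstChild is not None:
            return years[0].firstChild.data.strip()
    return None


year = extract_publication_year(SAMPLE_XML)
```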

I'd appreciate it a lot if someone could help me fix this syntax error.
Later, I'll have a few questions which are more related to Referencer
than to Python.
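
One way to chase an error like this without launching Referencer is to
compile the plugin source directly and print where the parser gives up.
This is only a sketch using the built-in compile(); the helper name is
made up, and the broken source string below is a stand-in, not the actual
plugin.

```python
def find_syntax_error(source, filename="<plugin>"):
    """Return (line, column, message) for the first syntax error, or None."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as err:
        return (err.lineno, err.offset, err.msg)
    return None


# Stand-in sources, used only to show both outcomes.
good = find_syntax_error("x = 1\n")
bad = find_syntax_error("def f(:\n    pass\n")
```

To check a real file, read it and pass its contents together with its
path. The check is only meaningful when run with the same Python version
that Referencer itself uses; if the console and the application embed
different interpreters, they can disagree about what is valid syntax.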

Thanks in advance!
--
Leonardo Fontenelle
http://leonardof.org
Leonardo F. Fontenelle
2008-12-01 01:58:15 UTC
I opened a "bug report" to make the plugin code available online:

https://bugs.launchpad.net/referencer/+bug/303839
Leonardo F. Fontenelle
2008-12-12 16:23:48 UTC
Post by Leonardo F. Fontenelle
https://bugs.launchpad.net/referencer/+bug/303839
Ten days ago I fixed some syntax errors in the plugin, but it still
doesn't work correctly. I suspect Referencer's Python API has an issue
with Unicode, but I haven't received any reply yet. Could anyone please
check whether Referencer's Python API really has an issue with Unicode,
or whether the issue is in the plugin itself? (Just follow the link above.)

Thanks!
--
Leonardo Fontenelle
http://leonardof.org
John Spray
2008-12-12 18:09:54 UTC
Leonardo,

I would not be surprised if Referencer's Python interface has text
encoding issues. I've been very busy lately, and will continue to be
so for some time. I suggest you have a look at the pubmed plugin; it
seems to do some encoding operations. Search for 'encode("utf-8")'.

Regards,
John
Leonardo F. Fontenelle
2008-12-13 12:47:20 UTC
Post by John Spray
I would not be surprised if Referencer's Python interface has text
encoding issues. I've been very busy lately, and will continue to be
so for some time. I suggest you have a look at the pubmed plugin; it
seems to do some encoding operations. Search for 'encode("utf-8")'.
Thank you, that fixed it! (I wonder if we'll need it when Python 3.0
becomes widely used.) I'll discuss a few semantic issues here in the
list and then attach the new plugin to the bug tracker.
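
For anyone hitting the same problem, the fix amounts to encoding text as
UTF-8 before handing it to Referencer. The sketch below is written for
Python 3 and is only an illustration: FakeDocument is a hypothetical
stand-in for the document object Referencer passes to a plugin (only the
set_field name comes from the snippet earlier in the thread), and to_utf8
is a made-up helper. Whether a Python 3 build of Referencer would still
want bytes is not something this thread settles.

```python
def to_utf8(value):
    """Encode text as UTF-8 bytes; pass through values that are already bytes."""
    if isinstance(value, bytes):
        return value
    return str(value).encode("utf-8")


class FakeDocument:
    """Hypothetical stand-in for Referencer's document object."""

    def __init__(self):
        self.fields = {}

    def set_field(self, name, value):
        self.fields[name] = value


doc = FakeDocument()
# Titles from SciELO often contain accented characters.
doc.set_field("title", to_utf8("Atenção primária à saúde"))
```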
--
Leonardo Fontenelle
http://leonardof.org