problems when importing bibtex entries
Brice Goglin
2007-03-05 13:04:27 UTC

I am trying Referencer since it looks very promising. While importing my
existing bibtex files, I always get an error "Invalid byte sequence in
conversion input - This problem was encountered while parsing import".
For instance, when the file contains:

AUTHOR = { Martin Steckermeier and Frank Bellosa },
TITLE = { Using Locality Information in Userlevel Scheduling },
INSTITUTION = { University of Erlangen-Nürnberg -- Computer Science
Department -- Operating Systems -- {IMMD IV} },
YEAR = 1995,
NUMBER = { TR-95-14 },
NOTE = { \url{mstecker at informatik.uni-erlangen.de}
\url{bellosa at informatik.uni-erlangen.de} },
ADDRESS = { Martensstraße 1, 91058 Erlangen, Germany },
DAY = 23,

I get this error with the following output in the terminal:
Publisher =

I tried adding a dummy publisher field, but it does not help. And it
seems to me that all the required fields for a "techreport" entry are
set in my file (I looked in Emacs' bibtex mode to know which ones are

Any idea?

John Spray
2007-03-05 13:38:10 UTC
Hmm, when I import this snippet from a UTF-8 encoded file, it works but
the ? and so on get mangled. When I save it as an iso-8859-1 file it
just works, output on the console like:
Publisher = University of Erlangen-N?rnberg ? Computer Science
Department ? Operating Systems ? IMMD IV (0)
Note = \urlmstecker at informatik.uni-erlangen.de
\urlbellosa at informatik.uni-erlangen.de (0)
Address = Martensstra?e 1, 91058 Erlangen, Germany (0)
Month = DEC(0)
Day = 23(0)

What is your LANG environment variable set to? What encoding is the
bibtex file in?
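
Both questions can be answered from a shell or a short script. The following is a minimal sketch, not part of Referencer: `guess_encoding` is a hypothetical helper that only distinguishes pure ASCII, valid UTF-8, and everything else (typically a legacy 8-bit charset such as ISO-8859-1 or ISO-8859-15). The last line prints the charset implied by the current locale settings, which may come from LC_CTYPE rather than LANG.

```python
import locale


def guess_encoding(data: bytes) -> str:
    """Classify raw file contents as 'ascii', 'utf-8', or 'not-utf-8'."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8: most likely a legacy 8-bit encoding.
        return "not-utf-8"


# The same text gives different answers depending on how it was saved.
print(guess_encoding("Erlangen-Nürnberg".encode("utf-8")))
print(guess_encoding("Erlangen-Nürnberg".encode("iso-8859-1")))

# Charset the current locale asks programs to use.
print(locale.getpreferredencoding(False))
```

To check a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to the helper.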

Brice Goglin
2007-03-05 14:58:12 UTC
Post by John Spray
Hmm, when I import this snippet from a UTF-8 encoded file, it works but
the ? and so on get mangled.
I should have chosen another example, these German characters could
confuse the discussion :)

I don't have any problem with these special characters here. After
searching a little bit, I found out that the failure is caused by the
double-dash in the institution:
INSTITUTION = { University of Erlangen-Nürnberg -- Computer Science
Department - Operating Systems - {IMMD IV} },
=> fails
INSTITUTION = { University of Erlangen-Nürnberg - Computer Science
Department - Operating Systems - {IMMD IV} },
=> works
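
One possible explanation, which nothing in this thread confirms: if the importer turns the TeX ligature `--` into a Unicode en dash and the text is later converted to a legacy 8-bit charset such as the ISO-8859-15 implied by an `@euro` locale, the conversion has no byte for that character. The sketch below only illustrates that mechanism in Python; `dashes_to_unicode` and `fits_charset` are hypothetical names, and this is not Referencer's code.

```python
import re


def dashes_to_unicode(text: str) -> str:
    """Hypothetical import step: replace the TeX ligature '--' with an en dash."""
    return re.sub(r"(?<!-)--(?!-)", "\u2013", text)


def fits_charset(text: str, charset: str) -> bool:
    """Return True if every character of text can be encoded in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


failing = dashes_to_unicode("Erlangen-Nürnberg -- Computer Science")
working = dashes_to_unicode("Erlangen-Nürnberg - Computer Science")

# Compare a legacy 8-bit charset with UTF-8 for both variants.
for charset in ("iso-8859-15", "utf-8"):
    print(charset, fits_charset(failing, charset), fits_charset(working, charset))
```

Under this assumption the ü is harmless, since ISO-8859-15 contains it, while the en dash is not representable there.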

Apart from this problem, I finally managed to locate where my other
failing entries had a problem. It seems that Referencer does not like
having a single quote (apostrophe) in the publisher or booktitle field.
I had several entries with "O'Reilly" or "Developer's" in the publisher
or booktitle field; these are accepted once I remove the quote. Having a
quote in the title or author does not seem to cause any problem. Here
are the failing entries, in case you want to look at them:

@misc{ braam99intermezzo,
author = "Peter Braam Braam",
title = "{T}he {I}nter{M}ezzo {F}ile {S}ystem",
booktitle = "Proceedings of the O'Reilly Perl Conference 3",
year = "1999",
url = "http://citeseer.nj.nec.com/braam99intermezzo.html"
}

@Book{ bovet03understanding,
author = {Daniel P. Bovet and Marco Cesati},
title = "{U}nderstanding the {L}inux {K}ernel, {S}econd {E}dition",
publisher = "O'Reilly",
year = 2003,
isbn = "0-596-00213-0",
}

@Book{ love04linux,
author = {Robert Love},
title = "{L}inux {K}ernel {D}evelopment",
publisher = "Developer's Library, Sams Publishing",
year = 2004,
isbn = "0-672-32512-8",
}

All of them have been used in earlier publications without ever getting
a problem with bibtool or bibtex from what I remember. I don't know
whether Referencer uses its own parser or something common to
bibtex/bibtool anyway...
Post by John Spray
What is your LANG environment variable set to? What encoding is the
bibtex file in?
I don't know if it matters anymore, but in case it does:

I don't have any LANG set. I just have the following config:
LC_CTYPE=fr_FR@euro
LC_TIME=fr_FR@euro

Setting LANG to en_US before starting referencer does not seem to help.

By the way, it would be great to display the parsing error in the error
window instead of in the terminal :)

thanks a lot,
Brice Goglin
2007-03-05 15:08:26 UTC
To bring more confusion, there are some entries where the quote is
accepted in booktitle. For instance:

@inproceedings{ schmuck02gpfs,
author = "Frank Schmuck and Roger Haskin",
title = "{GPFS}: {A} {S}hared-{D}isk {F}ile {S}ystem for {L}arge
{C}omputing {C}lusters",
booktitle = "Proceedings of the Conference on File and Storage
Technologies (FAST'02)",
publisher = "USENIX, Berkeley, CA",
pages = "231--244",
year = "2002",
month = JAN,
address = "Monterey, CA",
}

Brice Goglin
2007-03-06 20:38:31 UTC
And last thing for now:

When exporting the database as a bibtex file, most quotes are refused,
including the one above, giving the following error:

escapeBibtexAccents '
(Referencer:6870): glibmm-CRITICAL **:
unhandled exception (type Glib::Error) in signal handler:
domain: g_convert_error
code : 1
what : Invalid byte sequence in conversion input
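
The trace suggests the failure happens while characters are being escaped for export. As a rough illustration of a more forgiving approach, here is a minimal Python sketch. It is not Referencer's `escapeBibtexAccents`: the `LATEX_ESCAPES` table and `escape_for_bibtex` are hypothetical, and the table is deliberately tiny. The idea is to work on Unicode text, map the characters it knows to LaTeX escapes, and copy everything else through (including a plain apostrophe) instead of aborting the export.

```python
# Hypothetical lookup table; a real exporter would need many more entries.
LATEX_ESCAPES = {
    "ü": r'{\"u}',
    "ß": r"{\ss}",
    "é": r"{\'e}",
    "\u2013": "--",  # en dash back to the TeX ligature
    "\u2019": "'",   # typographic apostrophe back to ASCII
}


def escape_for_bibtex(text: str) -> str:
    """Replace known non-ASCII characters with LaTeX escapes.

    ASCII characters, including the apostrophe, are copied unchanged.
    Unknown non-ASCII characters are also kept as-is rather than raising,
    so a single unexpected character cannot abort the whole export.
    """
    return "".join(LATEX_ESCAPES.get(char, char) for char in text)


print(escape_for_bibtex("Technologies (FAST'02)"))
print(escape_for_bibtex("University of Erlangen-Nürnberg"))
```

Whether unknown characters should be passed through or reported to the user is a design choice; passing them through at least keeps the rest of the database exportable.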

By the way, I also got an error when exporting an entry whose title
field contained some latex code ($\null^2$). This entry works fine in

John Spray
2007-03-06 23:53:12 UTC
Thanks for the detailed report -- I'll start working through these items
in due course.

