Ignore DTDs #3

@superdude264

While trying to get Pub2TEI working on the GROBID gold standard data from PMC, I ran into the DTD issues mentioned in the README. After some research, I found that loading of external DTDs can be disabled with the following Saxon switch:

--parserFeature?uri=http%3A//apache.org/xml/features/nonvalidating/load-external-dtd:false

I've attached a file from the grobid PMC gold standard data I was having trouble with. The new switch allows the conversion to proceed.

sample.zip


The sample command in the README could be updated to:

java -jar Samples/saxon9he.jar \
	--parserFeature?uri=http%3A//apache.org/xml/features/nonvalidating/load-external-dtd:false \
	-a:off \
	-dtd:off \
	-expand:off \
	-o:out.tei.xml \
	-s:Samples/TestPubInput/BMJ/bmj_sample.xml \
	-t \
	-xsl:Stylesheets/Publishers.xsl
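
For anyone driving the parser from Java instead of the command line, the same Xerces feature can be set through JAXP. The following is only a minimal sketch, not part of Pub2TEI: the class name `IgnoreDtdSketch`, the helper `parses`, and the DTD path are hypothetical, and it assumes the default JDK parser (which is Xerces-based and recognizes this feature URI).

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class IgnoreDtdSketch {
    // Same feature URI as in the Saxon switch above (un-encoded form).
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Hypothetical helper: returns true if the document parses without error.
    static boolean parses(String xml, boolean loadExternalDtd) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(false);
            // With the feature off, a non-validating parser skips the external DTD.
            factory.setFeature(LOAD_EXTERNAL_DTD, loadExternalDtd);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler());
            reader.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The DOCTYPE points at a DTD that is assumed not to exist locally.
        String xml =
            "<!DOCTYPE article SYSTEM \"file:///nonexistent/hypothetical.dtd\">"
            + "<article><title>t</title></article>";
        System.out.println("load-external-dtd=false parses: " + parses(xml, false));
        System.out.println("load-external-dtd=true parses:  " + parses(xml, true));
    }
}
```

Note that skipping the DTD also skips any entity declarations it contains, so documents that rely on DTD-defined entities may still fail to parse.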
