Any XML champs here?

NoahFence

I'm in the middle of trying on new hats here at work, and one thing is clear.

I don't have a programmer's brain! Grrr....

I was tasked with creating an XML file in order to better grease the wheels of billing. I need to use our program (PlanetPress) to strip out the address block and put the lines in an XML file. I've managed to find a script somewhere on the internet that does that:

Set MyMeta = CreateObject("MetadataLib.MetaFile")
MyMeta.LoadFromFile Watch.GetMetadataFileName
MyMeta.Export "C:\Users\me\Desktop\PDF2XML\OUT\MetadataTest.xml",0

Unfortunately, this creates a slew of fields we just don't care about: the dimensions of the PDF, what printer it's going to, finishing options, etc.

In an effort to trim this down into something manageable, I'm looking at the XML file and trying to figure out what is what:

<?xml version="1.0" encoding="utf-8" ?>
<ol:Jo Version="2.1" xmlns:ol="olmeta2" xmlns:xs="http://www.w3.org/2001/XMLSchema/">
  <ol:At />
  <ol:Fi />
  <ol:Gr Selected="true">
    <ol:At />
    <ol:Fi />
    <ol:Do Selected="true">
      <ol:At>
        <TemplateName>PDF2XML</TemplateName>
        <DataFile>C:\ProgramData\Objectif Lune\PlanetPress Suite 7\PlanetPress Watch\Debug\debug25890D1F.dat</DataFile>
        <Time>14:28:10-04:00</Time>
        <Date>2014-03-27</Date>
        <Producer>PlanetPress Suite</Producer>
        <Creator>PlanetPress Design</Creator>
        <TargetDevice>Fiery </TargetDevice>
        <DataEncoding>MS-CP-1252</DataEncoding>
        <EFRIPBooklet>TwoUp</EFRIPBooklet>
      </ol:At>
      <ol:Fi>
        <mdCity>BOYNTON BEACH</mdCity>
        <mdState>FL</mdState>
        <mdZip>33435</mdZip>
      </ol:Fi>
      <ol:Da Selected="true" Offset="0" RealIndex="0">
        <ol:At>
          <ol__internalname_offset>0</ol__internalname_offset>
          <ol__internalname_realindex>0</ol__internalname_realindex>
        </ol:At>
        <ol:Fi>
          <Name>Name</Name>
          <Address1>22 Main Street</Address1>
          <Address2>#312</Address2>
          <Address3>BOYNTON BEACH, FL 33435</Address3>
          <Address4 />
        </ol:Fi>
        <ol:Pa>
          <ol:At>
            <Dimensions>612:792</Dimensions>
            <Orientation>Rotate0</Orientation>
            <MediaColor />
            <Weight />
            <MediaType />
            <Side>Front</Side>
            <Duplex />
            <InputSlot />
            <OutputBin />
            <Jog />
            <Staple />
          </ol:At>
          <ol:Fi />
        </ol:Pa>
        <ol:Pa>
          <ol:At>
            <Orientation>Rotate0</Orientation>
            <Dimensions>612:792</Dimensions>
            <Weight />
            <MediaColor />
            <MediaType />
            <EFRIPBooklet>TwoUp</EFRIPBooklet>
            <Side>Front</Side>
            <Duplex />
            <InputSlot />
            <OutputBin />
            <Jog />
            <Staple />
          </ol:At>
          <ol:Fi />
        </ol:Pa>
      </ol:Da>
    </ol:Do>
    <ol:Do Selected="true">
      <ol:At>

Questions:
1. I know that ol in HTML is "ordered list", and given the appearance of the XML file I'm sure that's what it means here. But what do the :At and :Fi and the others mean? I'm just trying to understand it a bit more.

2. Is there a way to strip out all that extraneous mumbo-jumbo (the media types, the finishing, etc.) and just leave the address? (Pay no mind to the extra address in the sample, as it's me playing around trying to figure stuff out.)


Super thanks for any help!
 
From what I can tell, the XML element names are whatever the programmers thought would be a good idea at the time. One hint is at the top of your file, in the bit that reads: xmlns:ol="olmeta2". This declares an XML namespace named "olmeta2" and binds "ol" to it as a shorthand prefix. That means that when you see something like "<ol:At>", it refers to the "At" element defined in the "olmeta2" namespace. Without documentation for that namespace (which I couldn't find on Google), though, I can't tell you what each element actually means.
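
To make the prefix idea a bit more concrete, here is a small sketch in Python (my choice of language, nothing PlanetPress-specific). The one-line SAMPLE document is a made-up stand-in; only the xmlns:ol="olmeta2" declaration and the element names come from your file. It shows that a parser swaps the "ol:" prefix for the namespace name, and that the prefix itself is arbitrary.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in document (an assumption for illustration); only the namespace
# declaration and the element names are taken from the sample in this thread.
SAMPLE = '<ol:Jo Version="2.1" xmlns:ol="olmeta2"><ol:At /><ol:Fi /></ol:Jo>'

root = ET.fromstring(SAMPLE)

# ElementTree reports each tag as "{namespace name}local name".
root_tag = root.tag
child_tags = [child.tag for child in root]
print(root_tag)
print(child_tags)

# A different prefix bound to the same namespace name gives the same element name.
other = ET.fromstring('<x:Jo xmlns:x="olmeta2" />')
same_name = other.tag == root.tag
print(same_name)
```

So "ol" only matters as a label inside this one file; the element's full name is the namespace name plus the part after the colon.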

As for the extraneous mumbo jumbo: it appears that the XML document is a template for printing a document, so I don't know that there would necessarily be a way to strip out information such as formatting (unless you want to write your own program to parse the XML and output a different document).
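
If you do go the write-your-own-program route, a minimal sketch might look like the following. It is in Python and rests on a few assumptions: that the address lines always sit in the ol:Fi element directly under each ol:Da, as they do in your sample; that empty fields such as Address4 can be dropped; and that the output element names "Addresses" and "Address" (which I made up) suit the billing side. SAMPLE is a cut-down, well-formed stand-in for the exported file, not the real thing.

```python
import xml.etree.ElementTree as ET

NS = {"ol": "olmeta2"}  # namespace name taken from the sample's xmlns:ol

# Cut-down, well-formed stand-in for the exported metadata file (assumption).
SAMPLE = """<ol:Jo Version="2.1" xmlns:ol="olmeta2">
  <ol:Gr Selected="true">
    <ol:Do Selected="true">
      <ol:At><TargetDevice>Fiery</TargetDevice></ol:At>
      <ol:Fi><mdCity>BOYNTON BEACH</mdCity></ol:Fi>
      <ol:Da Selected="true" Offset="0" RealIndex="0">
        <ol:Fi>
          <Name>Name</Name>
          <Address1>22 Main Street</Address1>
          <Address2>#312</Address2>
          <Address3>BOYNTON BEACH, FL 33435</Address3>
          <Address4 />
        </ol:Fi>
        <ol:Pa>
          <ol:At><Dimensions>612:792</Dimensions></ol:At>
          <ol:Fi />
        </ol:Pa>
      </ol:Da>
    </ol:Do>
  </ol:Gr>
</ol:Jo>"""


def extract_addresses(xml_text):
    """Return a new tree holding only the fields found under each ol:Da/ol:Fi."""
    root = ET.fromstring(xml_text)
    out = ET.Element("Addresses")  # hypothetical output element name
    for record in root.iterfind(".//ol:Da", NS):
        # Direct child only, so the ol:Fi inside ol:Pa is never picked up.
        fields = record.find("ol:Fi", NS)
        if fields is None:
            continue
        address = ET.SubElement(out, "Address")  # hypothetical element name
        for field in fields:
            # Skip empty fields such as <Address4 />.
            if field.text and field.text.strip():
                ET.SubElement(address, field.tag).text = field.text.strip()
    return out


result = extract_addresses(SAMPLE)
print(ET.tostring(result, encoding="unicode"))
```

For a real run you would read the exported file with ET.parse(...) instead of the inline string and write the result out with ET.ElementTree(result).write(...); the paths are yours to fill in.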
 
That makes sense.


Indeed it is going to print a document. Later on in the workflow they'd like me to create their XML file for their needs. Sort of an all-encompassing workflow. I think writing my own program is going to be the way to go.

Prepress is more than just making PDFs look pretty nowadays... :(

Thanks!
 
After googling "PlanetPress XML", my hot guess is that ol stands for Objectif Lune, the company that produces the software. I suggest you ask for the manuals and *cough* read them.
 
That sort of advice is known in the industry as "RTFM". ;)


Hehehe, exactly. There's a bit of deja-vu going on with this, though. Not the first time the guy asks for this kind of "help". Check this, that or the other thread.

Funnily enough, NoahFence, in the latter (oldest) thread someone recommends something to you that I was about to recommend as well (but then decided to give you an RTFM first ;)) - and you said you bookmarked it. I recently came across it myself and it's an awesome learning site, judging by its JavaScript track: www.codecademy.com

I'd recommend first the HTML/CSS, and then the JavaScript. You won't get around doing your homework if you want to make progress with that stuff - and if it isn't your job, it's certainly not ours. ;)

Have fun, it's a real rocker, and if you register you even get badges. :)
 
The problem is, I'm not a programmer, nor do I have the kind of innate ability to pick up that sort of thing easily.

I'm also working with deadlines.

Let's say I'm tasked with removing blacktop. I have no idea how to do it. First, I need to figure out what tool to use, even though I've no experience with blacktop. Never seen it.

Well, eventually I figure out what tool to use: a jackhammer. Fine.

Now I just need to figure out how to use a jackhammer. Under a deadline, it's all pretty daunting. Not impossible, just daunting.
 
